Skip to content

Ploughshare download#127

Open
andrpie wants to merge 16 commits into
mainfrom
add_download_ploughshare
Open

Ploughshare download#127
andrpie wants to merge 16 commits into
mainfrom
add_download_ploughshare

Conversation

@andrpie

@andrpie andrpie commented Apr 14, 2026

Copy link
Copy Markdown

Addresses #102 and NNPDF/pinecards#192.

This is the initial implementation of the class Plough that downloads the grids from ploughshare and converts them into pineappl format (or does whatever is necessary: cutting bins, renaming grids etc.)

Pinecard structure:

The folder would contain two files: ploughshare_link.txt and process_grids.sh (optionally postrun.sh and metadata.txt). Please have a look at the CMS_2JET_8TEV_3D example.
ploughshare_link.txt: contains the link to the file that has to be downloaded
process_grids.sh: responsible for conversion etc.

Class structure

Plough.run(): downloads the (tarball) file and extracts it in the output folder
Plough.generate_pineappl(): runs the process_grids.sh script inside the output folder (also tells it the grid names)

These methods are run when the class is initialised. If they were run conventionally (i.e. by run.py), then an external-pineappl comparison would have to be made at the pinefarm level - this is already done by pineappl import. Also, lines 194-200 of run.py assume that only one grid was generated.

The conversion happens inside process_grids.sh instead of postrun.sh, as postrun.sh only assumes that one grid was generated.

@andrpie andrpie requested review from achiefa and scarlehoff April 14, 2026 16:46

@felixhekhorn felixhekhorn left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also run pre-commit.

Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
andrpie and others added 9 commits April 17, 2026 10:13

@scarlehoff scarlehoff left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @felixhekhorn the run should be its own run

def subcommand(pinecard, theory_path, pdf, dry, finalize=None):

pinefarm will then run run and generate_pineappl (you can make generate_pineappl do nothing and put everything in a postrun.sh script that will be run as part of the postprocess method:

def postprocess(self):

which just runs postrun.sh and, importantly, burns in the metadata.txt into all grids which we want in general.

Comment thread src/pinefarm/external/plough.py
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
@andrpie

andrpie commented Apr 28, 2026

Copy link
Copy Markdown
Author

Now Plough is compatible with postrun.sh and its run is not called in __init__. I've added a condition in run.py to make it work. The pinecard structure got updated accordingly. Thanks for the suggestions!

Comment thread src/pinefarm/cli/run.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated

@scarlehoff scarlehoff left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Left a lot of nitpicks but seems to work (I tested CMS_2JET_8TEV_3D although I could not get grids because of Error: you need to install pineappl with feature fastnlo, which I won't; but I trust it would)

Comment thread src/pinefarm/external/interface.py Outdated
Comment thread src/pinefarm/external/interface.py Outdated
Comment thread src/pinefarm/external/interface.py Outdated
Comment thread src/pinefarm/external/plough.py
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated
Comment thread src/pinefarm/external/plough.py Outdated

@scarlehoff scarlehoff left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes! Once @felixhekhorn has time for a second look we can merge I think

@felixhekhorn

felixhekhorn commented May 11, 2026

Copy link
Copy Markdown
Contributor

As we discussed (when @scarlehoff was not around), I'd first like to have all relevant pinecards in NNPDF/pinecards#197 and NNPDF/pinecards#192 available, such that we are sure we have all tools and options we need in practice to discuss the different cases there (as the different datasets may need more or less structure)

@scarlehoff

Copy link
Copy Markdown
Member

Is this already the case? I see many runcards there. Is that all?

@andrpie

andrpie commented May 12, 2026

Copy link
Copy Markdown
Author

For single jets, I think there's one more runcard I should add. For dijets, I need to make the runcards compatible with the current shape of the Ploughshare module.

@andrpie

andrpie commented May 12, 2026

Copy link
Copy Markdown
Author

The pinecards have been updated so that they are compatible with this module now:
dijets (fastNLO grids): NNPDF/pinecards#192
single jets (APPLgrid): NNPDF/pinecards#197

@scarlehoff

Copy link
Copy Markdown
Member

Are all pinecards included now? (also for the one that @achiefa sent the email last week that was missing?) and for the right scale choice?

(Looking for a confirmation, a self-certification of sorts, before merging, I won't check them one by one)

@andrpie

andrpie commented May 19, 2026

Copy link
Copy Markdown
Author

The scale choice is right. For dijets, we've got everything. For single jets, we don't have the following pinecards:

  • CMS 13TeV
  • ATLAS 7TeV
  • ATLAS 13TeV
    Of course, I can add these. However, the need for them has not been flagged with me before.

@scarlehoff

scarlehoff commented May 19, 2026

Copy link
Copy Markdown
Member

Did you add them (the grids)? (I mean, I understand that the pinecards we want are the ones for the grids you have added, other grids are a different story)

@andrpie

andrpie commented May 19, 2026

Copy link
Copy Markdown
Author

Yes, all the grids that I was in charge of generating have been:

  • added to theories_slim
  • their pinecards are in the two PRs above

@felixhekhorn

felixhekhorn commented May 19, 2026

Copy link
Copy Markdown
Contributor

The pinecards currently contain a double information, which should only have a single source of truth:

  1. in metadata.txt there is a field ploughshare
  2. ploughshare_link.txt

EDIT:
Please keep only the ploughshare_link.txt and reconstruct the metadata link from there

@andrpie

andrpie commented May 19, 2026

Copy link
Copy Markdown
Author

Please keep only the ploughshare_link.txt and reconstruct the metadata link from there

This is now done and tested. The ploughshare link from metadata.txt is removed and it is now constructed at the interface level. The result is a key-value pair in a grid's metadata of the form:
ploughshare_link: https://ploughshare.web.cern.ch/ploughshare/record.php?group=applfast&dataset=applfast-atlas-incjets-appl-arxiv-1706.03192

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants